Sentence Fusion for Multidocument News Summarization

نویسندگان

  • Regina Barzilay
  • Kathleen McKeown
چکیده

A system that can produce informative summaries, highlighting common information found in many online documents, will help Web users to pinpoint information that they need without extensive reading. In this article, we introduce sentence fusion, a novel text-to-text generation technique for synthesizing common information across documents. Sentence fusion involves bottom-up local multisequence alignment to identify phrases conveying similar information and statistical generation to combine common phrases into a sentence. Sentence fusion moves the summarization field from the use of purely extractive methods to the generation of abstracts that contain sentences not found in any of the input documents and can synthesize information across sources.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Inferring Strategies for Sentence Ordering in Multidocument News Summarization

The problem of organizing information for multidocument summarization so that the generated summary is coherent has received relatively little attention. While sentence ordering for single document summarization can be determined from the ordering of sentences in the input article, this is not the case for multidocument summarization where summary sentences may be drawn from different input art...

متن کامل

Thai News Text Summarization and Its Application

Since Thai language lacks word/phrase/sentence boundaries, document summarization in Thai needs investigations in unit segmentation, unit selection, redundancy removal and evaluation dataset construction. In this work, we have proposed Thai Elementary Discourse Unit (TEDU) and a three-stage method of Thai multidocument summarization, i.e., unit segmentation, unit-graph formulation, and unit sel...

متن کامل

Applying two-level reinforcement ranking in query-oriented multidocument summarization

Sentence ranking is the issue of most concern in document summarization today. While traditional featurebased approaches evaluate sentence significance and rank the sentences relying on the features that are particularly designed to characterize the different aspects of the individual sentences, the newly emerging graphbased ranking algorithms (such as the PageRank-like algorithms) recursively ...

متن کامل

Sentence Clustering-based Summarization of Multiple Text Documents

With the rapid growth of the World Wide Web, information overload is becoming a problem for an increasingly large number of people. Automatic Multidocument summarization can be an indispensable solution to reduce the information overload problem on the web. This kind of summarization facility helps users to see at a glance what a collection is about and provides a new way of managing a vast hoa...

متن کامل

Sub-Event Based Multi-Document Summarization

The production of accurate and complete multiple-document summaries is challenged by the complexity of judging the usefulness of information to the user. Our aim is to determine whether identifying sub-events in a news topic could help us capture essential information to produce better summaries. We used six methods to create multi-document summaries and then compared them to find which method ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • Computational Linguistics

دوره 31  شماره 

صفحات  -

تاریخ انتشار 2005